Evaluating Speech Separation Systems

Author

  • Daniel P.W. Ellis
Abstract

Common evaluation standards are critical to making progress in any field, but they can also distort research by shifting all the attention to a limited subset of the problem. Here, we consider the problem of evaluating algorithms for speech separation and acoustic scene analysis, noting some weaknesses of existing measures and making some suggestions for future evaluations. We take the position that the most relevant ‘ground truth’ for sound mixture organization is the set of sources perceived by human listeners, and that the best evaluation standards would measure the machine’s match to this perception at a level abstracted away from the low-level signal features most often considered in signal processing.

Similar resources

Joint optimization of recurrent networks exploiting source auto-regression for source separation

Source separation is very difficult under music interference. In this paper, we propose a novel recurrent network exploiting the auto-regressions of speech and music interference for source separation. An auto-regression can capture the short-term temporal dependencies in data to help the source separation. For the separation, we independently separate the magnitude spectra of speech an...

First-order Differential Beamforming and Joint-process Estimation for Spatial Source Separation

Speech enhancement is a technique required to ensure the success of speech recognition systems working under strongly noisy conditions, and to ensure intelligibility in speech transmission and coding. Array beamforming has traditionally been used to improve the signal-to-noise ratio. Two-sensor systems based on First-Order Differential Beamformers (FODB) have been proposed as a pro...

Single channel speech separation in modulation frequency domain based on a novel pitch range estimation method

Computational Auditory Scene Analysis (CASA) has been the focus in recent literature for speech separation from monaural mixtures. The performance of current CASA systems on voiced speech separation strictly depends on the robustness of the algorithm used for pitch frequency estimation. We propose a new system that estimates pitch (frequency) range of a target utterance and separates voiced por...

Application of Over-complete Blind so Automatic Speech Re

Spoken-dialogue-based information retrieval systems used in mobile environments are becoming popular. However, the mobile environment changes dynamically and contains many interfering signals. These two effects degrade automatic speech recognition (ASR) accuracy and hence the performance of spoken-dialogue-based information retrieval systems. One way to...

TasNet: time-domain audio separation network for real-time, single-channel speech separation

Robust speech processing in multi-talker environments requires effective speech separation. Recent deep learning systems have made significant progress toward solving this problem, yet it remains challenging, particularly in real-time, short-latency applications. Most methods attempt to construct a mask for each source in a time-frequency representation of the mixture signal, which is not necessari...

Publication date: 2004